In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
AI Platform Online Prediction now supports custom Python code via custom prediction routines, including custom (stateful) pre/post-processing and models not created with the standard supported frameworks (TensorFlow, Keras, scikit-learn, XGBoost).
In this notebook, we show how to deploy a PyTorch model to AI Platform using Custom Prediction Code, using the Iris dataset for a multi-class classification problem.
This tutorial uses billable components of Google Cloud Platform (GCP):
Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
Otherwise, make sure your environment meets this notebook's requirements. You need the following:
The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:
Install virtualenv and create a virtual environment that uses Python 3.
Activate that environment and run pip install jupyter in a shell to install Jupyter.
Run jupyter notebook in a shell to launch Jupyter.
Open this notebook in the Jupyter Notebook Dashboard.
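A minimal sketch of those setup commands in a shell (assuming pip and Python 3 are available; the environment name aip-env is just an example):
pip install virtualenv               # install virtualenv
virtualenv -p python3 aip-env        # create a Python 3 virtual environment
source aip-env/bin/activate          # activate the environment
pip install jupyter                  # install Jupyter inside it
jupyter notebook                     # launch Jupyter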
The following steps are required, regardless of your notebook environment.
Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
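For example, the following cell (a trivial illustration using a throwaway variable) shows both behaviors:
In [ ]:
# Trivial illustration: Jupyter runs the `!` line as a shell command and
# interpolates the Python variable MESSAGE where $MESSAGE appears.
MESSAGE = 'hello from Python'
!echo $MESSAGE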
If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.
Otherwise, follow these steps:
In the GCP Console, go to the Create service account key page.
From the Service account drop-down list, select New service account.
In the Service account name field, enter a name.
From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.
Click Create. A JSON file that contains your key downloads to your local environment.
Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.
In [ ]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.
if 'google.colab' in sys.modules:
    from google.colab import auth as google_auth
    google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
    %env GOOGLE_APPLICATION_CREDENTIALS ''
In [ ]:
!pip install torch --user
Set your GCP project ID and Cloud Storage bucket name in the cell below.
In [ ]:
PROJECT = '' # TODO (Set to your GCP Project name)
BUCKET = '' # TODO (Set to your GCS Bucket name)
In [ ]:
!gcloud config set project {PROJECT}
!gcloud config get-value project
In this example, we build a classifier for the simple Iris dataset, so first we download the data CSV file locally.
In [ ]:
!mkdir -p data models
In [ ]:
LOCAL_DATA_DIR = "data/iris.csv"
In [ ]:
from urllib.request import urlretrieve
urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", LOCAL_DATA_DIR)
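As an optional sanity check, peek at the first few lines of the downloaded file:
In [ ]:
# Optional sanity check: print the first few rows of the downloaded CSV.
with open(LOCAL_DATA_DIR) as f:
    for _ in range(5):
        print(f.readline().strip())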
Make sure that the PyTorch package is installed.
In [ ]:
import torch
from torch.autograd import Variable
print('PyTorch Version: {}'.format(torch.__version__))
In [ ]:
import pandas as pd
CLASS_VOCAB = ['setosa', 'versicolor', 'virginica']
datatrain = pd.read_csv(LOCAL_DATA_DIR, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'])
#change string value to numeric
datatrain.loc[datatrain['species']=='Iris-setosa', 'species']=0
datatrain.loc[datatrain['species']=='Iris-versicolor', 'species']=1
datatrain.loc[datatrain['species']=='Iris-virginica', 'species']=2
datatrain = datatrain.apply(pd.to_numeric)
# change dataframe to array (DataFrame.as_matrix() was removed in recent pandas versions)
datatrain_array = datatrain.values
#split x and y (feature and target)
xtrain = datatrain_array[:,:4]
ytrain = datatrain_array[:,4]
input_features = xtrain.shape[1]
num_classes = len(CLASS_VOCAB)
print('Records loaded: {}'.format(len(xtrain)))
print('Number of input features: {}'.format(input_features))
print('Number of classes: {}'.format(num_classes))
In [ ]:
HIDDEN_UNITS = 10
LEARNING_RATE = 0.1
In [ ]:
model = torch.nn.Sequential(
    torch.nn.Linear(input_features, HIDDEN_UNITS),
    torch.nn.Sigmoid(),
    torch.nn.Linear(HIDDEN_UNITS, num_classes),
    torch.nn.Softmax(dim=1)  # explicit dim avoids a deprecation warning
)

# Note: CrossEntropyLoss applies log-softmax internally, so the Softmax layer
# above is not strictly needed for training; it is kept so the deployed model
# outputs class probabilities directly.
loss_metric = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
In [ ]:
NUM_EPOCHS = 10000

for epoch in range(NUM_EPOCHS):
    x = Variable(torch.Tensor(xtrain).float())
    y = Variable(torch.Tensor(ytrain).long())
    optimizer.zero_grad()
    y_pred = model(x)
    loss = loss_metric(y_pred, y)
    loss.backward()
    optimizer.step()
    if epoch % 1000 == 0:
        print('Epoch [{}/{}] Loss: {}'.format(epoch + 1, NUM_EPOCHS, round(loss.item(), 3)))

print('Epoch [{}/{}] Loss: {}'.format(epoch + 1, NUM_EPOCHS, round(loss.item(), 3)))
In [ ]:
LOCAL_MODEL_DIR = "models/model.pt"
torch.save(model, LOCAL_MODEL_DIR)
iris_classifier = torch.load(LOCAL_MODEL_DIR)
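Note that torch.save(model, ...) pickles the whole model object, which requires the model's class to be importable at load time; that is fine here because torch.nn.Sequential ships with PyTorch. As an aside (not required for this tutorial), the weights-only alternative via the state_dict looks like this:
In [ ]:
# Alternative sketch (not used in this tutorial): save only the learned
# weights; loading then requires rebuilding the architecture first.
torch.save(model.state_dict(), 'models/weights.pt')

rebuilt = torch.nn.Sequential(
    torch.nn.Linear(input_features, HIDDEN_UNITS),
    torch.nn.Sigmoid(),
    torch.nn.Linear(HIDDEN_UNITS, num_classes),
    torch.nn.Softmax(dim=1)
)
rebuilt.load_state_dict(torch.load('models/weights.pt'))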
In [ ]:
def predict_class(instances):
    instances = torch.Tensor(instances)
    output = iris_classifier(instances)
    _, predicted = torch.max(output, 1)
    return predicted
Get predictions for the first 5 instances in the dataset
In [ ]:
predicted = predict_class(xtrain[0:5])
print([CLASS_VOCAB[class_index] for class_index in predicted])
Get the classification accuracy on the training data
In [ ]:
import numpy as np
accuracy = round(sum(np.array(predict_class(xtrain)) == ytrain)/float(len(ytrain))*100,2)
print('Classification accuracy: {} %'.format(accuracy))
In [ ]:
GCS_MODEL_DIR='models/pytorch/iris_classifier/'
!gsutil -m cp -r {LOCAL_MODEL_DIR} gs://{BUCKET}/{GCS_MODEL_DIR}
!gsutil ls gs://{BUCKET}/{GCS_MODEL_DIR}
In [ ]:
%%writefile model.py
import os
import pandas as pd
from google.cloud import storage
import torch


class PyTorchIrisClassifier(object):

    def __init__(self, model):
        self._model = model
        self.class_vocab = ['setosa', 'versicolor', 'virginica']

    @classmethod
    def from_path(cls, model_dir):
        model_file = os.path.join(model_dir, 'model.pt')
        model = torch.load(model_file)
        return cls(model)

    def predict(self, instances, **kwargs):
        # DataFrame.values replaces the removed as_matrix()
        data = pd.DataFrame(instances).values
        inputs = torch.Tensor(data)
        outputs = self._model(inputs)
        _, predicted = torch.max(outputs, 1)
        return [self.class_vocab[class_index] for class_index in predicted]
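Before packaging, you can sanity-check the predictor class locally against the model saved in the models/ directory (a quick test; it reuses xtrain from the cells above):
In [ ]:
# Quick local test of the custom predictor before packaging and deploying.
from model import PyTorchIrisClassifier

classifier = PyTorchIrisClassifier.from_path('models')
print(classifier.predict(xtrain[0:5].tolist()))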
In [ ]:
%%writefile setup.py
from setuptools import setup

REQUIRED_PACKAGES = []

setup(
    name="iris-custom-model",
    version="0.1",
    scripts=["model.py"],
    install_requires=REQUIRED_PACKAGES
)
In [ ]:
!python setup.py sdist
In [ ]:
GCS_PACKAGE_URI='models/pytorch/packages/iris-custom-model-0.1.tar.gz'
!gsutil cp ./dist/iris-custom-model-0.1.tar.gz gs://{BUCKET}/{GCS_PACKAGE_URI}
!gsutil ls gs://{BUCKET}/{GCS_PACKAGE_URI}
In [ ]:
MODEL_NAME='torch_iris_classifier'
REGION = 'us-central1'
In [ ]:
# You can uncomment to enable logging
!gcloud ai-platform models create {MODEL_NAME} --regions {REGION} #--enable-logging --enable-console-logging
!gcloud ai-platform models list | grep 'torch'
Once you have your custom package ready, you can specify it as an argument when creating a version resource. Note that you need to provide the path to your package (as package-uris) and also the name of the class that contains your custom predict method (as prediction-class).
You need to use compiled packages compatible with Cloud AI Platform. Package information is available here.
The gs://cloud-ai-pytorch bucket contains compiled packages for PyTorch that are compatible with Cloud AI Platform prediction. The files are mirrored from the official builds at https://download.pytorch.org/whl/cpu/torch_stable.html
In order to deploy a PyTorch model on Cloud AI Platform Online Predictions, you must add one of these packages to the packageUris field of the version you deploy. Pick the package matching your Python and PyTorch version. The package names follow this template:
torch-{TORCH_VERSION_NUMBER}-{PYTHON_VERSION}-linux_x86_64.whl
where PYTHON_VERSION is cp35-cp35m for Python 3 with runtime versions < 1.15, cp37-cp37m for Python 3 with runtime versions >= 1.15, and cp27-cp27mu for Python 2.
For example, to deploy a PyTorch model based on PyTorch 1.1.0 and Python 3, the gcloud command would look like:
gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} \
    ... \
    --package-uris=gs://{MY_PACKAGE_BUCKET}/my_package-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.1.0-cp35-cp35m-linux_x86_64.whl
In [ ]:
MODEL_VERSION='v3'
RUNTIME_VERSION='1.15'
MODEL_CLASS='model.PyTorchIrisClassifier'
!gcloud beta ai-platform versions create {MODEL_VERSION} --model={MODEL_NAME} \
    --origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
    --python-version=3.7 \
    --runtime-version={RUNTIME_VERSION} \
    --machine-type=mls1-c4-m4 \
    --package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI},gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \
    --prediction-class={MODEL_CLASS}
In [ ]:
!gcloud ai-platform versions list --model {MODEL_NAME}
In [ ]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
api = discovery.build(
    'ml', 'v1', credentials=credentials,
    discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json')


def estimate(project, model_name, version, instances):
    request_data = {'instances': instances}
    model_url = 'projects/{}/models/{}/versions/{}'.format(project, model_name, version)
    response = api.projects().predict(body=request_data, name=model_url).execute()
    predictions = response["predictions"]
    return predictions
In [ ]:
instances = [
    [6.8, 2.8, 4.8, 1.4],
    [6. , 3.4, 4.5, 1.6]
]

predictions = estimate(instances=instances,
                       project=PROJECT,
                       model_name=MODEL_NAME,
                       version=MODEL_VERSION)
print(predictions)
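You can also exercise the deployed version from the command line with gcloud ai-platform predict, which reads one JSON instance per line from a file (a quick sketch; instances.json is just an example filename):
In [ ]:
# Sketch: call the same deployed version via the gcloud CLI.
import json

# Write one JSON instance per line, as --json-instances expects.
with open('instances.json', 'w') as f:
    for instance in instances:
        f.write(json.dumps(instance) + '\n')

!gcloud ai-platform predict --model {MODEL_NAME} --version {MODEL_VERSION} --json-instances instances.json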